(57) Abstract

The invention relates to a method of producing a hydroxylated triple helical protein in yeast comprising the steps of: introducing to a 
suitable yeast host cell a first nucleotide sequence encoding P4H α subunit, a second nucleotide sequence encoding P4H β subunit and one
or more product-encoding nucleotide sequences which encode(s) a polypeptide(s) or peptide(s) which, when hydroxylated, form the said 
hydroxylated triple helical protein, each of said first, second and product-encoding nucleotide sequences being operably linked to promoter 
sequences; and culturing said yeast host cell under conditions suitable to achieve expression of said first, second and product-encoding 
nucleotide sequences to thereby produce said hydroxylated triple helical protein; wherein said method is characterised in that the step of 
introducing the first, second and product-encoding nucleotide sequences results in the said first, second and product-encoding nucleotide 
sequences, together with their respective operably linked promoter sequences, being borne on one or more replicable DNA molecules that 
are stably retained and segregated by said yeast host cell during said step of culturing. Transformed yeast host cells and triple helical 
proteins produced in accordance with the method of the invention are also claimed.

STABLE EXPRESSION OF TRIPLE HELICAL PROTEINS

Field of the Invention: 

This invention relates to the production of hydroxylated triple helical
proteins such as natural and synthetic collagens, natural and synthetic
collagen fragments, and natural and synthetic collagen-like proteins, by
recombinant DNA technology. In particular, the invention relates to a
method for producing hydroxylated triple helical proteins in yeast host cells
by introducing to a suitable yeast host cell DNA sequences encoding the
triple helical protein as well as prolyl 4-hydroxylase (P4H), in a manner
wherein the introduced DNA sequences are stably retained and segregated by
the yeast host cells.

Background of the Invention: 

The collagen family of proteins represents the most abundant protein
in mammals, forming the major fibrous component of, for example, skin,
bone, tendon, cartilage and blood vessels. Each collagen protein consists of
three polypeptide chains (alpha chains) characterised by a (Gly-X-Y)n
repeating sequence, which are folded into a triple helical protein
conformation. Type I collagen (typically found in skin, tendon, bone and
cornea) consists of two types of polypeptide chain termed α1(I) and α2(I) [i.e.
α1(I)2 α2(I)], while other collagen types such as Type II [α1(II)3] and Type III
[α1(III)3] have three identical polypeptide chains. These collagen proteins
spontaneously aggregate to form fibrils which are incorporated into the
extracellular matrix where, in mature tissue, they have a structural role and,
in developing tissue, they have a directive role. The collagen fibrils, after
cross-linking, are highly insoluble and have great tensile strength.

The ability of collagen to form insoluble fibrils makes it attractive
for numerous medical applications including bioimplant production, soft
tissue augmentation and wound/burn dressings. To date, most collagens
approved for these applications have been obtained from animal sources,
primarily bovine. While such animal-sourced collagens have been
successful, there is some concern that their use risks serious immunogenicity
problems and transmission of infective diseases and spongiform
encephalopathies (e.g. bovine spongiform encephalopathy (BSE)).

Accordingly, there is significant interest in the development of methods of 



WO 98/18918 



PCT/AU97/00721 



production of collagens or collagen fragments by recombinant DNA
technology. Further, the use of recombinant DNA technology is desirable in
that it allows for the potential production of synthetic collagens and collagen
fragments which may include, for example, exogenous biologically active
domains (i.e. to provide additional protein function) and other useful
characteristics (e.g. improved biocompatibility and stability).

The in vivo biosynthesis of collagen proteins is a complex process
involving many post-translational events. A key event is the hydroxylation
by the enzyme prolyl 4-hydroxylase (P4H) of prolyl residues in the Y-position
of the repeating (Gly-X-Y)n sequences to 4-hydroxyproline. This
hydroxylation has been found to be beneficial for nucleation of folding of
triple helical proteins. For collagens, it is essential for stability at body
temperature. Accordingly, the development of a commercially viable method
for the production of recombinant collagen requires co-expression of P4H
with the alpha chains. For mammalian host cells, co-expression of P4H will
occur autonomously since these cells should naturally express P4H.
However, for yeast host cells, which for reasons of cost, ease and efficiency
are more attractive for expression of recombinant eukaryotic proteins,
transformation with DNA sequences encoding P4H will also be required.
Since P4H consists of α and β subunits of about 60 kDa and 60 kDa, yeast
host cells for expression of recombinant collagen will require
co-transformation with at least three exogenous DNA sequences (i.e., encoding
an alpha chain, P4H α subunit and P4H β subunit) and stability problems
would therefore be expected if cloned on three separate vectors or,
alternatively, all on one episomal type vector. Indeed, even under continuous
selection pressure, many episomal type vectors suffer stability problems if
they are large or are present at relatively low copy number. An object of the
present invention is therefore to provide a method for expressing
recombinant collagen and other triple helical proteins from yeast host cells
wherein the introduced DNA sequences are stably retained and segregated
independent of continuous selection pressure.



Summary of the Invention: 

Thus, in a first aspect, the present invention provides a method of
producing a hydroxylated triple helical protein in yeast comprising the steps
of:

introducing to a suitable yeast host cell a first nucleotide sequence
encoding P4H α subunit, a second nucleotide sequence encoding P4H β
subunit and one or more product-encoding nucleotide sequences which
encode(s) a polypeptide(s) or peptide(s) which, when hydroxylated, form the
said hydroxylated triple helical protein, each of said first, second and
product-encoding nucleotide sequences being operably linked to promoter
sequences, and

culturing said yeast host cell under conditions suitable to achieve
expression of said first, second and product-encoding nucleotide sequences
to thereby produce said hydroxylated triple helical protein;

wherein said method is characterised in that the step of introducing the first,
second and product-encoding nucleotide sequences results in the said first,
second and product-encoding nucleotide sequences, together with their
respective operably linked promoter sequences, being borne on one or more
replicable DNA molecules that are stably retained and segregated by said
yeast host cell during said step of culturing.

In a second aspect, the present invention provides a yeast host cell
capable of producing a hydroxylated triple helical protein, said yeast host
cell including a first nucleotide sequence encoding P4H α subunit, a second
nucleotide sequence encoding P4H β subunit and one or more
product-encoding nucleotide sequences which encode(s) a polypeptide(s) or
peptide(s) which, when hydroxylated, form the said hydroxylated triple
helical protein, each of said first, second and product-encoding nucleotide
sequences being operably linked to promoter sequences, and wherein said
first, second and product-encoding nucleotide sequences, together with their
respective operably linked promoter sequences, are borne on one or more
replicable DNA molecules that are stably retained and segregated by said
yeast host cell.

In a third aspect, the present invention provides a triple helical
protein produced in accordance with the method of the first aspect.

In a fourth aspect, the present invention provides a biomaterial or
therapeutic product comprising a triple helical protein produced in
accordance with the method of the first aspect.

Detailed disclosure of the Invention:

The method according to the invention requires that the first and
second nucleotide sequences encoding the P4H α and β subunits and the
product-encoding nucleotide sequences be introduced to a suitable yeast host
cell in a manner such that they are borne on one or more DNA molecules that
are stably retained and segregated by the yeast host cell during culturing. In
this way, all daughter cells will include the first, second and
product-encoding nucleotide sequences and thus stable and efficient expression of a
hydroxylated triple helical protein product can be ensured throughout the
culturing step and without the use of continuous selection pressure.

The method according to the invention can be achieved by: (i)
integrating (e.g. by homologous recombination) one or more of the exogenous
nucleotide sequences (i.e. one or more of the first, second and
product-encoding nucleotide sequences) into one or more chromosome(s) of the yeast
host cell, or (ii) including one or more of the exogenous nucleotide sequences
within one or more vector(s) including a centromere (CEN) sequence(s).
Alternatively, a combination of these techniques may be used or one or both
of these techniques may be used in combination with the use of one or two
high copy number plasmid(s) which include the remainder of the exogenous
nucleotide sequences. For example, the first and second nucleotide
sequences encoding the P4H α and β subunits may be integrated into a host
chromosome while the product-encoding sequences may be included on
vector(s) including a CEN sequence or on a high copy number vector(s).

Preferably, the method of the invention is achieved by including the
exogenous nucleotide sequences within a vector(s) including a CEN
sequence. Particularly preferred are the CEN sequence-including YAC (yeast
artificial chromosome) vectors (Cohen et al., 1993) and pYEUra3 vectors
(Clontech, Cat. No. 6195-1). Other vectors including a CEN sequence may be
generated by cloning a CEN sequence into any suitable expression vector.

Where one or more of the exogenous nucleotide sequences are
included in a high copy number vector(s), it is preferred that the high copy
number vector(s) is/are selected from those that may be present at 20 to 500
(preferably, 400 to 500) copies per host cell. Particularly preferred high copy
number vectors are the YEp vectors.

The method according to the invention enables the production of
hydroxylated triple helical proteins. The term "triple helical protein" is to be
understood as referring to a homo- or heterotrimeric protein consisting of a
polypeptide(s) or peptide(s) which include at least a region having the
general peptide formula: (Gly X Y)n, in which Gly is glycine, X and Y
represent the same or different amino acids (the identities of which may vary
from Gly X Y triplet to Gly X Y triplet) but wherein X and Y are frequently
proline which in the case of Y becomes, after modification, hydroxyproline
(Hyp), and n is in the range of 2 to 1500 (preferably 10 to 350), which region
forms, together with the same or similar regions of two other polypeptides or
peptides, a triple helical protein conformation. The term therefore
encompasses natural and synthetic collagens, natural and synthetic collagen
fragments, and natural and synthetic collagen-like proteins (e.g. macrophage
scavenger receptor and lung-surfactant proteins) and as such includes any
procollagen and collagen (e.g. Types I-XIX) with or without propeptides,
globular domains and/or intervening non-collagenous sequences and, further,
with or without native or variant amino acid sequences from human or other
species. Synthetic collagen and fragments encompassed by the term "triple
helical protein" may also include non-collagenous, non-triple helical domains
at the amino and/or carboxy terminal ends or elsewhere.
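
By way of illustration only, the (Gly X Y)n criterion set out above may be
checked computationally. The following Python sketch is not part of the
specification: the function name longest_gly_xy_run is hypothetical, sequences
are assumed to be given in one-letter amino acid code, and hydroxyproline is
assumed to be written as proline (P) since it arises only after modification.

```python
def longest_gly_xy_run(seq: str) -> int:
    """Return the largest n for which seq contains n consecutive Gly-X-Y triplets.

    A triplet is any three residues whose first residue is glycine ("G").
    X and Y may be any residue, as in the general formula (Gly X Y)n.
    """
    seq = seq.upper()
    best = 0
    for start in range(len(seq)):
        n = 0
        pos = start
        # Extend the run one full triplet at a time while a Gly leads each triplet.
        while pos + 3 <= len(seq) and seq[pos] == "G":
            n += 1
            pos += 3
        best = max(best, n)
    return best


# Hypothetical test peptide: two flanking residues, then three Gly-X-Y triplets.
example = "MKGPPGAPGEK"
print(longest_gly_xy_run(example))
```

Under these assumptions, a region would meet the lower bound of the stated
range (n of 2 to 1500) when the returned value is 2 or more.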

Accordingly, product-encoding nucleotide sequence(s) suitable for
use in the method according to the invention may be of great diversity. It is,
however, preferred that the product-encoding nucleotide sequence(s) be
selected from nucleotide sequences encoding natural collagens and
fragments thereof, such as COL1A1 (D'Alessio et al., 1988; Westerhausen et
al., 1991), COL1A2 (de Wet et al., 1987), COL2A1 (Cheah et al., 1985) and
COL3A1 (Ala-Kokko et al., 1989) and fragments and combinations of these,
and synthetic collagens and fragments thereof.

Product-encoding nucleotide sequence(s) which encode natural
collagen fragments may encode fragments which include or exclude the
N-propeptide region, the N-telopeptide, the C-telopeptide or the C-propeptide
or various combinations of these.

Product-encoding nucleotide sequences which encode synthetic
collagens and fragments thereof, preferably encode a polypeptide(s) or
peptide(s) of the general formula: (A)l-(B)m-(Gly X Y)n-(C)o-(D)p, in which Gly
is glycine, X and Y represent the same or different amino acids, the identities
of which may vary from Gly X Y triplet to Gly X Y triplet but wherein Y must
be > one proline, A and D are polypeptide or peptide domains which may or
may not include triple helical forming (Gly X Y)n repeating sequences, B and
C are intervening sequences which do not contain triple helical forming (Gly
X Y)n repeating sequences, n is in the range of 2 to 1500 (preferably, 10 to
300) and l, m, o and p are each independently selected from 0 and 1.

The product-encoding nucleotide sequence(s) may include a
sequence(s) encoding a secretion signal so that the polypeptide(s) or
peptide(s) expressed from the product-encoding nucleotide sequence(s) are
secreted.

Expression of the product-encoding nucleotide sequence(s) may be
driven by constitutive yeast promoter sequences (e.g. ADH1 (Hitzeman et al.,
1981; Pihlajaniemi et al., 1987), HIS3 (Mahadevan & Struhl, 1990), 786 (no
author given, 1996 Innovations 5, 15) and PGK1 (Tuite et al., 1982)), but more
preferably, by inducible yeast promoter sequences such as GAL1-10 (Goff et
al., 1984), GAL7 (St. John & Davis, 1981), ADH2 (Thukral et al., 1991) and
CUP1 (Macreadie et al., 1989).

The first and second nucleotide sequences encoding the P4H α and β
subunits can be of any animal origin although they are preferably of avian or
mammalian, particularly human, origin (Helaakoski et al., 1989). It is also
envisaged that the first and second nucleotide sequences may originate from
different species. In addition, the second nucleotide sequence encoding the
P4H β subunit may include a sequence encoding an endoplasmic reticulum
(ER) retention signal (e.g. HDEL, KDEL or KEEL) with or without other target
signals so as to allow expression of the P4H in the ER, cytoplasm or a target
organelle or, alternatively, so as to be secreted.

Expression of the first and second nucleotide sequences may be
driven by constitutive or inducible yeast promoter sequences such as those
mentioned above. It is believed, however, that it is advantageous to achieve
expression of the α and β subunits in a co-ordinated manner using the same or
different promoter sequences with the same induction characteristics, but
preferably by the use of a bidirectional promoter sequence. Accordingly, it is
preferred that the first and second nucleotide sequences be expressed by the
yeast GAL1-10 bidirectional promoter sequence, although other bidirectional
promoter sequences would also be suitable.

Multiple copies of the first, second and/or product-encoding
nucleotide sequences may be introduced to the yeast host cell (e.g. present
on a YAC vector or integrated into a host chromosome). It may be
particularly advantageous to provide the product-encoding nucleotide
sequence(s) in multicopy and, accordingly, it may be preferred to introduce
the product-encoding nucleotide sequence(s) on a high copy number plasmid
(e.g. a YEp plasmid).

The introduced first, second and product-encoding nucleotide
sequences may be borne on one or more stably retained and segregated DNA
molecules. Where borne on more than one DNA molecule, the DNA
molecules may be a combination of host chromosome(s) and/or CEN
sequence-including vector(s) in combination with high copy number
vector(s). Some specific examples of yeast host cells suitable for use in the
method according to the invention are transformed with the following DNA
molecules:



1. YEp-P3 + pYEUra3-αβ,
2. YEp-P3 + pYACαβ,
3. YEpCEN-P3 + pYEUra3-αβ,
4. YEpCEN-P3 + pYACαβ,
5. pYAC-P3 + pYACαβ,
6. pYAC-P3 + pYEUra3-αβ,
7. pYACαβ-P3;

wherein P3 represents a product-encoding nucleotide sequence(s), α
and β represent, respectively, nucleotide sequences encoding the P4H α
subunit and P4H β subunit, and CEN represents an introduced centromere
sequence. The pYEUra3 and pYAC vectors include CEN sequences.
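
Purely as an aid to reading the seven combinations listed above, they may be
written out as a small data structure and checked for completeness. The
following Python sketch is an assumption-laden illustration rather than part
of the disclosure: the names REQUIRED, HAS_CEN, COMBINATIONS, carried and
p4h_on_cen are hypothetical, and "alpha", "beta" and "P3" simply stand for
the three kinds of exogenous sequence named in the text.

```python
# The three kinds of exogenous sequence every transformant must carry.
REQUIRED = {"alpha", "beta", "P3"}

# Whether each vector backbone carries a CEN sequence. Per the text, pYEUra3
# and pYAC include CEN sequences and YEpCEN has an introduced CEN; plain YEp
# is a high copy number vector without one.
HAS_CEN = {"YEp": False, "YEpCEN": True, "pYEUra3": True, "pYAC": True}

# Each combination is a list of (backbone, sequences carried) pairs.
COMBINATIONS = {
    1: [("YEp", {"P3"}), ("pYEUra3", {"alpha", "beta"})],
    2: [("YEp", {"P3"}), ("pYAC", {"alpha", "beta"})],
    3: [("YEpCEN", {"P3"}), ("pYEUra3", {"alpha", "beta"})],
    4: [("YEpCEN", {"P3"}), ("pYAC", {"alpha", "beta"})],
    5: [("pYAC", {"P3"}), ("pYAC", {"alpha", "beta"})],
    6: [("pYAC", {"P3"}), ("pYEUra3", {"alpha", "beta"})],
    7: [("pYAC", {"alpha", "beta", "P3"})],
}


def carried(combo):
    """Union of all sequences carried by the DNA molecules of one combination."""
    found = set()
    for _backbone, payload in combo:
        found |= payload
    return found


def p4h_on_cen(combo):
    """True if every molecule bearing a P4H subunit sequence has a CEN sequence."""
    return all(HAS_CEN[backbone]
               for backbone, payload in combo
               if payload & {"alpha", "beta"})


for number, combo in COMBINATIONS.items():
    print(number, carried(combo) == REQUIRED, p4h_on_cen(combo))
```

In this model every listed combination carries all three sequences, and the
P4H subunit sequences always sit on a CEN-bearing vector; only combinations
1 and 2 place the product-encoding sequence on a vector without a CEN
sequence.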

Triple helical protein products produced in accordance with the
method of the invention may be purified from the yeast host cell culture by
techniques including standard chromatographic and precipitation techniques
(Miller & Rhodes, 1982). For collagens, pepsin treatment and NaCl
precipitation at acid and neutral pH may be used (Trelstad, 1982).
Immunoaffinity chromatography can be used for constructs that contain
appropriate recognition sequences, such as the Flag sequence which is
recognised by an M1 or M2 monoclonal antibody, or a triple helical epitope,
such as that recognised by the antibody 2G8/B1 (Glattauer et al., 1997).

Yeast host cells suitable for use in the method according to the
invention may be selected from genera including, but not limited to,
Saccharomyces, Kluyveromyces, Schizosaccharomyces, Yarrowia and Pichia.
Particularly preferred yeast host cells may be selected from S. cerevisiae, K.
lactis, S. pombe, Y. lipolytica and P. pastoris.

As indicated above, it is particularly preferred that the first, second
and product-encoding nucleotide sequences be introduced to the yeast host
cell by transformation with one or more YAC vectors. YAC vectors are linear
DNA vectors which include yeast CEN sequences, at least one autonomous
replication signal (e.g. ars) usually derived from yeast, and telomere ends
(again, usually derived from yeast). They also generally include a yeast
selectable marker such as URA3, TRP1, LEU2, or HIS3, and in some cases, an
ochre suppressor (e.g. sup4-o) which allows for red/white selection in
adenine-requiring strains (i.e. the mutation of the adenine gene being due to
a premature ochre stop codon). More commonly, two yeast selectable
markers are included, one on each arm of the artificial chromosome (each
arm separated by the CEN). This allows selection of only those transformed
hosts containing YACs with introduced sequences of interest within the
desired restriction cloning site. That is, correct insertion of the sequences of
interest (e.g. an expression cassette) rejoins the two arms of the restricted
YAC, thus rendering transformants prototrophic for both markers. YACs
have been designed to allow for the introduction of large exogenous
nucleotide sequences (i.e. of the order of 100 kb or more) into yeast host cells.
The present inventors have hereinafter shown that such YACs may be used
for the stable expression of multiple exogenous nucleotide sequences (e.g.
nucleotide sequences encoding a natural collagen and both the α and β
subunits of P4H).

In some embodiments of the invention, it may be preferred that one
or more (but not all) of the first, second and product-encoding nucleotide
sequences be introduced to the yeast host cell by transformation with one or
two YEp vectors. YEp vectors carry all or part of the yeast 2µ plasmid with at
least the ori of replication. They also include a yeast selectable marker such
as HIS3, LEU2, TRP1, URA3, CUP1 or G418 resistance, and often also contain
a separate ori, generally ColE1, and markers, such as ampicillin resistance,
for manipulation in E. coli. They show high copy number, for example 20-400
per cell, and are generally efficiently segregated. Stability during cell
division is dependent on the vector also containing the REP2/STB locus from
the 2µ plasmid. However, stability is not as good as that of the endogenous 2µ plasmid
of the host, particularly when heterologous genes are induced for expression.
Stability also declines with increasing plasmid size (Wiseman, 1991).

The terms "comprise", "comprises" and "comprising" as used
throughout the specification are intended to refer to the inclusion of a stated
component or feature or group of components or features with or without the
inclusion of a further component or feature or group of components or
features.

The invention will now be described by way of reference to the
following non-limiting examples and accompanying figures.

Brief description of the accompanying figures:

Figure 1 shows, diagrammatically, the construction of the expression
vector pYEUra3.2.12β#39α#5 (labeled pYEUra3-Mβα).

Figure 2 shows the nucleotide sequence for the COLIII1.6 kb DNA.

Figure 3 shows, diagrammatically, regions of the human collagen III
gene that have been isolated by PCR. The 1.6 kb DNA used in the examples
hereinafter is also shown. It is to be understood that the other regions shown
in the figure could substitute for the COLIII1.6 kb DNA in those examples.

Figure 4 shows, diagrammatically, the construction of the expression
vector YEpFlagCOLIII1.6kb (labeled YEpFlag-C3).

Figure 5 shows, diagrammatically, the construction of pYAC5βα.

Figure 6 shows, diagrammatically, the construction of pYACβα-COL
III1.6 kb.

Figure 7 outlines the construction of synthetic collagen products.

Figure 8 provides the nucleotide sequence for SYN-C3 together with
the amino acid sequence of the encoded polypeptide.

Examples: 

Example 1: Construction of a yeast vector for co-ordinated co-expression of
the α and β subunits of prolyl-4-hydroxylase.

Production of yeast expression vector: 

pYEUra3 (Clontech) contains the bidirectional promoter for GAL1-10
expression. Induction by galactose in the absence of glucose results in high
level expression from pGAL1 of any protein encoded by DNA sequences
inserted in the correct orientation in the MCS (multiple cloning site) [either
XhoI, SalI, XbaI or BamHI sites] provided there is an initiating ATG start
codon. For pGAL10, expression induced by galactose occurs if the DNA
sequences to be expressed are inserted in the EcoRI site in frame with the
ATG codon of GAL10.

In order to utilise the EcoRI site for cloning, without the necessity
that the insert be in frame with the ATG of GAL10 for expression, it was
necessary to modify pYEUra3 to remove the GAL10 initiation codon. This
was done as follows. A PCR fragment was generated using pYEUra3 as
template and primers 3465 [5'-CTG.TAG.Agg.atc.cCCGGG.TAC.GGA.GC-3',
where the nucleotides shown in lower case code for a BamHI site] and
1440 [5'-TTA.TAT.Tga.att.cTC.AAA.AAT.TC-3', where the nucleotides shown
in lower case specify an EcoRI restriction site]. Primer 1440 introduces an
EcoRI site preceding the initiating ATG of GAL10 in pYEUra3. The PCR
fragment was restricted with BamHI and EcoRI and cloned into pYEUra3
similarly digested with BamHI and EcoRI, replacing the BamHI-EcoRI
fragment containing an ATG start codon with a BamHI-EcoRI fragment
lacking this ATG, to generate plasmid pYEUra3.2.12. The EcoRI site can then
be used as a cloning site for which, as with the MCS at the other end of the
promoter, an initiating codon must be provided by the inserted DNA
sequence. The inserted sequence is thereby placed under control of the
bidirectional pGAL1-10 promoter, and its expression is rendered inducible by
galactose, as is that of DNA sequences inserted in the MCS at the other end
of the promoter. Cloning DNA sequences in the
MCS and in the EcoRI site allows for co-ordinate expression by the
bidirectional promoter when induced by galactose.
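
As an illustrative check only, the restriction sites said to be encoded by the
lower case nucleotides of primers 3465 and 1440 may be confirmed with a few
lines of Python. The sketch below is not part of the original disclosure: the
helper names clean, marked_site and check are hypothetical, the primer strings
are copied from the text with the dots treated as visual grouping only, and
the recognition sequences GGATCC (BamHI) and GAATTC (EcoRI) are the standard
ones for those enzymes.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as printed; lower case marks the engineered site.
PRIMER_3465 = "CTG.TAG.Agg.atc.cCCGGG.TAC.GGA.GC"
PRIMER_1440 = "TTA.TAT.Tga.att.cTC.AAA.AAT.TC"


def clean(primer: str) -> str:
    """Drop the grouping dots and return the sequence in upper case."""
    return "".join(ch for ch in primer if ch.isalpha()).upper()


def marked_site(primer: str) -> str:
    """Return only the nucleotides printed in lower case, in upper case."""
    return "".join(ch for ch in primer if ch.islower()).upper()


def check(primer: str, enzyme: str) -> bool:
    """True if the lower case run equals the enzyme's site and occurs in the primer."""
    site = SITES[enzyme]
    return marked_site(primer) == site and site in clean(primer)


print(check(PRIMER_3465, "BamHI"), check(PRIMER_1440, "EcoRI"))
```

The same helpers could be applied to the other primers in the examples,
provided the lower case run is the only lower case text in the primer string.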

Isolation of DNA molecules encoding the α and β subunits of P4H:

The α subunit of P4H was PCR amplified from cDNA (Clontech
Human Kidney Quick Clone™ cDNA Cat.#7112-1) using primers 1826
[5'-TGT.AAA.ATT.AAA.gga.tcc.CAA.AG.ATG.TGG.TAT-3', lower case encodes
BamHI site, ATG initiating codon for α subunit] and 1452
[5'-GCCG.gga.tcc.TG.TCA.TTC.CAA.TGA.CAA.CGT-3', lower case encodes
BamHI site, TCA translation stop codon]. Two isoforms were obtained and
cloned into the BamHI site of pBluescript II SK+ [Stratagene Cat.# 212205]
as storage vector to give pSK+α.1 (form I) and pSK+α.2 (form II). There are
no BamHI sites in the DNA encoding the α subunit. The signal sequence for
secretion is present in the BamHI fragment of both forms.

The β subunit of P4H [also known as PDI/protein disulfide isomerase]
[Pihlajaniemi et al., 1987] was PCR amplified from cDNA (Clontech Human
Kidney Quick Clone™ cDNA Cat.#7112-1) using the primer pair 2280
[5'-AC.TGG.ACG.GAT.CCC.GAG.CGC.CCC.GCC.TGC.TCC.GTG.TCC.GAC.ATG-3']
and 2261 [5'-G.GTT.CTC.CTT.ggt.gac.cTC.CCC.TT-3', where the nucleotides
shown in lower case encode a BstEII site] for the amino terminal part of the
β subunit and the primer pair 2260
[5'-GAA.GGG.GAg.gtc.acc.AAG.GAG.AAC-3', where the lower case
nucleotides encode a BstEII site] and 1932
[5'-CC.TTC.AGG.ATC.CTA.TTA.GAC.TTC.ATC.TTT.CAAC.AGC-3'] for the
carboxy terminal part of the β subunit. The two PCR fragments for the
β subunit were then ligated
together following BstEII digestion, to produce a single fragment encoding
the entire β subunit. This fragment was then amplified using the primers
2280 [5'-AC.TGG.Acg.gatccC.GAG.CGC.CCC.GCC.TGC.TCC.GTC.TCC.GAC.ATG-3',
where ggatcc encodes a BamHI site, and ATG is the
initiating codon of the β subunit] and 1932
[5'-CC.TTC.Agg.atc.cTA.TTA.GAC.TTC.ATC.TTT.CAC.AGC-3', where ggatcc
encodes a BamHI site and TTA is the translation stop codon for the
β subunit] and then cloned
into the BamHI site of pBluescript II SK+ to generate the storage vector
pSK+β. Subsequently, the BamHI fragment of pSK+β was amplified by
using primers 2698 [5'-CTA.GTT.gaa.ttc.TAC.ACA.ATG.CTG.CGC.CGC.GCT.CTG.CTG-3',
where gaattc encodes an EcoRI site and the ATG is the
initiating codon of the β subunit] and 2699
[5'-GCA.ATG.gaa.ttc.TTA.TTA.CAG.TTC.GTG.CAC.AGC.TTT-3', where gaattc
encodes an EcoRI site, TTA.TTA provides two translation stop codons, and
GTG changes a lysine
[K] residue to a histidine [H] residue to provide a native yeast ER retention
signal, HDEL (i.e. His.Asp.Glu.Leu), rather than a mammalian KDEL ER
retention signal]. The resultant PCR fragment was then blunt end cloned into
the SrfI site of pCRScript [Stratagene, Cat.# 211190] to generate pCRScriptβ.
After retrieving the EcoRI fragment containing the β subunit from
pCRScriptβ by EcoRI digestion, the fragment was again cloned into the EcoRI
site of pCRScript to generate pCRScriptβEcoRI#4.
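
The joining of the two β subunit PCR fragments at the BstEII site relies on
primers 2261 and 2260 being complementary across that site. As a hedged
illustration, the Python sketch below checks this using the sequence of primer
2261 and the first seven nucleotide groups of primer 2260 as printed above.
The names clean, reverse_complement, PRIMER_2261 and PRIMER_2260_HEAD are
hypothetical, and the BstEII recognition sequence is taken to be GGTNACC.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BSTEII = re.compile(r"GGT[ACGT]ACC")  # GGTNACC, N being any nucleotide


def clean(primer: str) -> str:
    """Drop the grouping dots and return the sequence in upper case."""
    return "".join(ch for ch in primer if ch.isalpha()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_2261 = clean("G.GTT.CTC.CTT.ggt.gac.cTC.CCC.TT")
# Only the first seven groups of primer 2260 are used in this sketch.
PRIMER_2260_HEAD = clean("GAA.GGG.GAg.gtc.acc.AAG.GAG")

site_2261 = BSTEII.search(PRIMER_2261).group()
site_2260 = BSTEII.search(PRIMER_2260_HEAD).group()

# Apart from the first nucleotide of 2260, the two primers read the same
# stretch of DNA from opposite strands.
overlap = reverse_complement(PRIMER_2261).startswith(PRIMER_2260_HEAD[1:])

print(site_2261, site_2260, overlap)
```

Because the two primers overlap on opposite strands, both PCR fragments carry
the same BstEII site, which is consistent with their being ligated together
after BstEII digestion as described.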
Construction of yeast expression vector including fragments encoding the α and
β subunits of P4H:

The β subunit fragment was obtained as an EcoRI fragment from
EcoRI digestion of pCRScriptβEcoRI#4. This EcoRI fragment was cloned into
the EcoRI site of pYEUra3.2.12 to generate plasmid pYEUra3.2.12β#39. The
α subunit fragment was re-excised from pSK+α.1 by BamHI and
cloned into the BamHI site of pYEUra3.2.12β#39 to give
pYEUra3.2.12β#39α#5 (Figure 1). The β subunit fragment is under control
of pGAL10 and the α subunit fragment is under control of pGAL1. This is a
bidirectional promoter and allows co-ordinated induced expression of both
subunits of prolyl-4-hydroxylase. Both fragments provide a native ATG
initiating codon for translation. The encoded β subunit has its own
secretion signal and an HDEL endoplasmic reticulum (ER) retention sequence at the
carboxy terminus of the protein. While the encoded α subunit with its own
signal sequence has no ER retention signal it should, nevertheless, be
retained through its interaction with the β subunit.

Example 2: Co-ordinated co-expression of a collagen segment and
prolyl-4-hydroxylase (α and β subunits) and synthesis of hydroxylated collagen Type
III in yeast

A 1.6 kbp recombinant collagen fragment was generated by PCR
using primers 1989 [Forward primer
5'-gct.agc.aag.ctt.GGA.GCT.CCA.GGC.CCA.CTT.GGG.ATT.GCT.GGG-3'] and 1903
[Reverse primer
5'-tcg.cga.tct.aga.TTA.TAA.AAA.GCA.AAC.AGG.GCC.AAC.GTC.CAC.ACC-3']
homologous to a region of the collagen type III alpha 1 chain (COL3A1). The
template for isolation of the fragment of type III collagen alpha 1 chain was
prepared from Wizard purified DNA obtained from a cDNA library [HL1123n
Lambda Max 1 Clontech Lot#1245, Human Kidney cDNA 5'-Stretch
Library].

The actual size of the isolated 1.6 kbp fragment is 1635 bp,
comprising 1611 bp of COL3A1 DNA flanked either side by 12 bp derived
from the primers. The 1611 bp of COL3A1 DNA corresponds to nucleotides
#2713-4826 (i.e. codons #905-1442) of the full-length coding sequence,
thereby spanning a portion of the α-helix region, all of the C-telopeptide, all
of the C-propeptide and the stop codon. The nucleotide sequence for the
COL3A1 DNA is provided at Figure 2. The region covered by the COL3A1
DNA is shown at Figure 3. The 1.6 kbp fragment has a NheI [GCTAGC] site
and a HindIII [AAGCTT] site added at the 5' end and a XbaI [TCTAGA] site
and a NruI [TCGCGA] site added at the 3' end [where the 5' end is taken to be
the forward direction of the reading frame, i.e. the amino terminal end of the
derived coding sequence, and the 3' end is that derived from the reverse
primer corresponding to the 3' end of the gene and carboxy end of the
derived amino acid sequence]. This confers portability on the collagen
fragment.
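
The stated fragment size and flanking sites can be cross-checked against the
primer sequences. The following Python sketch is offered only as an
illustration under stated assumptions: the helper name tail is hypothetical,
the primer strings are those given for primers 1989 and 1903 with lower case
marking the added nucleotides, and the 1611 bp figure is taken directly from
the text.

```python
# Primer sequences as printed; lower case marks nucleotides added by the primer.
FWD_1989 = "gct.agc.aag.ctt GGA.GCT.CCA.GGC.CCA.CTT.GGG.ATT.GCT.GGG"
REV_1903 = "tcg.cga.tct.aga.TTA.TAA.AAA.GCA.AAC.AGG.GCC.AAC.GTC.CAC.ACC"

# Recognition sequences quoted in the text.
SITES = {"NheI": "GCTAGC", "HindIII": "AAGCTT", "XbaI": "TCTAGA", "NruI": "TCGCGA"}

COL3A1_BP = 1611  # length of the COL3A1-derived portion, from the text


def tail(primer: str) -> str:
    """Return the lower case (added) nucleotides of a primer, in upper case."""
    return "".join(ch for ch in primer if ch.islower()).upper()


fwd_tail = tail(FWD_1989)
rev_tail = tail(REV_1903)

# Total fragment length: COL3A1 portion plus the two primer-derived tails.
total_bp = COL3A1_BP + len(fwd_tail) + len(rev_tail)

print(fwd_tail, rev_tail, total_bp)
```

Each tail is 12 nucleotides long and is made up of exactly two six-base
recognition sequences, which agrees with the 12 bp flanks and the 1635 bp
total given above.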

The 1.6 kbp fragment was cloned into the SmaI site of YEpFlag1 [IBI
Catalogue #13400] so that the coding sequence is fused in frame with the
vector expressed Flag protein. This allows for in frame expression of the
introduced collagen gene fragment as a fusion protein when grown on
ethanol. The blunt end cloning was performed by ligation of the SmaI
digested vector sequence [gel purified] and the 1.6 kbp PCR fragment [gel
purified, non-phosphorylated] at 20°C, in the presence of SmaI, to prevent
recircularisation of the vector alone and reduce the level of false positive
transformants obtained. There are no SmaI, NheI, HindIII, XbaI or NruI sites
in the fragment of collagen DNA used in the cloning.
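
A check of this kind (confirming that an insert contains none of the cloning sites) can be sketched as a simple string scan. The NheI, HindIII, XbaI and NruI sequences are those quoted in the text; CCCGGG is the standard SmaI recognition sequence. The function name and the toy sequence are hypothetical, and the toy sequence is not COL3A1 DNA.

```python
# Recognition sequences; all five are palindromic, so scanning one strand suffices.
SITES = {
    "SmaI": "CCCGGG",
    "NheI": "GCTAGC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "NruI": "TCGCGA",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for every site in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits

# Toy sequence (illustrative only) carrying one SmaI and one HindIII site.
toy = "GGTGCTCCAGGTCCCGGGAAGCTT"
print(find_sites(toy))
```

An insert suitable for the cloning scheme described would return an empty list for every enzyme.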

Small scale mini-preparations [prepared using Bio101 columns and
described methods for their use] of DNA from ampicillin resistant
transformant colonies of E. coli were screened by restriction enzyme analysis.
10 ml cultures rather than 1 ml cultures were required to prepare an adequate
level of DNA for analysis, as YEpFlag plasmids do not appear to be at a high
copy number in E. coli.

The fusion protein was of the form: yeast α factor signal sequence for
direction to the ER and commitment to the yeast secretion pathway, yeast α
factor propeptide with cleavage sites for kex2-endopeptidase, resulting in
removal of all α-factor amino acid residues and generation of a free
Flag-tagged amino terminal end, Flag peptide for detection and tagging of the
fusion protein (8 amino acid residues), linker peptide (4 amino acid
residues), collagen helix (255 amino acid residues), collagen C-telopeptide
[C-tel] (25 amino acid residues) and C-propeptide [C-pro] (255 amino acid
residues) (for aid in formation of triple helix). The expected Flag-tagged
protein consists of 547 amino acid residues with an expected MW of ~60 kDa.
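
The residue count and approximate molecular weight quoted above can be reproduced with a few lines. The component lengths are those given in the text; the average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a value stated in the document.

```python
# Component lengths (amino acid residues) as listed in the text.
COMPONENTS = [
    ("Flag peptide", 8),
    ("linker peptide", 4),
    ("collagen helix", 255),
    ("C-telopeptide", 25),
    ("C-propeptide", 255),
]

# Assumption: rule-of-thumb average mass per residue, in daltons.
AVG_RESIDUE_MASS_DA = 110

total_residues = sum(length for _, length in COMPONENTS)
approx_mw_kda = total_residues * AVG_RESIDUE_MASS_DA / 1000

print(total_residues, round(approx_mw_kda, 1))
```

The sum comes to 547 residues and, with the assumed average mass, roughly 60 kDa.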

Expression of the fusion protein in YEpFlag1 is under the control of
the ADH2 promoter which is repressed by glucose but active in the presence
of ethanol [a by-product of glucose metabolism]. There are multiple copies of
the vector in individual yeast transformants due to the presence of the yeast
2 micron origin of replication in the vector, which leads to elevated
expression of the 1.6 kbp PCR collagen fragment when glucose repression is
lifted by consumption of glucose during growth. One unique feature of this
cloning scheme is that inserts of the 1.6 kbp collagen fragment in the wrong
orientation will not form fusion products as the terminal leucine residue
preceding the stop codon is coded by the codon AAT. In reverse orientation
this generates a stop codon TAA. The result of incorrect insertion is the
addition of only a single leucine coding codon [the stop codon TAA in
reverse is AAT] following the Flag sequence before the protein is terminated.
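
The orientation argument can be illustrated with a reverse-complement calculation. The sketch assumes that the terminal leucine codon on the coding strand is TTA, so that the AAT quoted in the text is its complement written 3' to 5'; that reading is an interpretation, not something the text states. The helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Assumed 3' end of the insert: leucine codon TTA followed by stop codon TAA.
tail = "TTA" + "TAA"

# What the reading frame sees first if the insert is ligated backwards.
flipped = reverse_complement(tail)
codons = [flipped[i:i + 3] for i in range(0, len(flipped), 3)]
print(flipped, codons)
```

Under this assumption the six-base tail is its own reverse complement, so a reversed insert presents a single leucine codon (TTA) followed immediately by a stop codon (TAA), consistent with the outcome described.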

The amino acid sequence of the Flag-tagged fusion protein at the 
point of fusion is N-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys-[Flag]-Ala-Ser-Lys- 
Leu-[linker]-Gly-Ala-Pro-Gly-Pro-Leu-Gly-Ile-Ala-[α-helix].

The YEpFlag collagen construct [hereinafter referred to as YEpFlag
COLIII1.6kb; Figure 4] was introduced into a tryptophan auxotrophic yeast
strain such as for example BJ3505 [a pep4::HIS3 prb1Δ1.6R HIS3 lys2-208
trp1-Δ101 ura3-52 gal2 can1], BJ5462 [α ura3-52 trp1 leu2Δ1 his3Δ200 pep4::HIS3
prb1Δ1.6R can1 GAL], (YGSC) JHRY1-5Da [a his4-519 ura3-52 leu2-3 leu2-112
trp1 pep4-3] or KRYD1 [BJ3505xBJ5462 diploid] by transformation using
electroporation, lithium acetate or spheroplast regeneration. Tryptophan
prototrophic transformants were obtained, grown to high cell density in
selective media [lacking tryptophan] followed by transfer to YPHSM, YEPM
or YEPD or YEPGal, YEPE as described in the protocol provided with the
YEpFlag expression system [IBI catalogue #13400]. At 3-9 days following
inoculation 1 ml aliquots of culture were taken and pellets and supernatants
separated by centrifugation at 13000 rpm in a benchtop centrifuge. Total yeast
pellets were resuspended in 100 µl of gel loading buffer [5xSDS] containing
PMSF [0.002M], vortexed vigorously for 2 minutes, and boiled for 5 minutes.
From the pellets 900 µl supernatants were retained to which 100 µl
5xSDS/0.002M PMSF was added, and treated as described for the pellets. For
both pellets and supernatants 20 µl aliquots were assayed by Western blot
analysis of SDS-PAGE yeast total protein or of supernatants [media] following
transfer to nitrocellulose and prehybridisation of the filters in blotto. Western
blotting was carried out using α-Flag MAb M1 [against N-terminal free Flag]
(International Biotechnologies Inc., (Eastman Kodak) Cat. No. IB13001) or M2
[against Flag] (International Biotechnologies Inc., (Eastman Kodak) Cat. No.
IB13010).

Western blots revealed the presence of a protein band of
approximately 60 kDa. This is the expected size of a protein fusion containing
Flag-helix-C-tel-C-pro. After prolonged incubation the Flag responsive
antibodies detected the appearance of the fusion product in the media.
Detection in both pellet and media supernatant with M1 antibody
demonstrates that the α factor leader has been completely removed. No
precursor forms with α factor pro-region [glycosylated or not] were observed.

No band corresponding to 60 kDa was obtained which hybridised to
M1 or M2 with proteins obtained from untransformed yeast hosts. When
yeast transformed with YEpFlag [no insert] alone was used, bands were
obtained in pellets, but only with M2 MAb. These bands correspond to
unsecreted α-proregion with C-terminal Flag and various glycosylated forms of
the same. No Flag is detected in supernatants but this is to be expected as it
is only 8 amino acids long. No expression from the ADH2 promoter for any
construct is observed in the presence of glucose.

YEpFlagCOLIII 1.6kb was also co-introduced [co-transformed] into
yeast strains such as BJ5462 and KRYD1 which are capable of growth on
galactose along with pYEUra3 [Clontech] [pYEUra3 and its derivatives
contain the bidirectional GAL1-10 promoter. Both the ADH2 and GAL1-10
promoters are repressed by glucose. The GAL1-10 promoter is induced by
galactose] or pYEUra3.2.12 [a modification of the Clontech parent vector
which allows cloning of genes into an EcoRI site without the necessity of the
introduced gene being in the correct reading frame] or pYEUra3.2.12β#39 [in
which the DNA encoding the β subunit (equivalent to protein disulfide
isomerase) of prolyl-4-hydroxylase is cloned into the EcoRI site of
pYEUra3.2.12 under control of the GAL10 promoter] or pYEUra3.2.12β#39α#5
[in which the DNA encoding the α subunit of P4H is cloned into the BamHI
site of pYEUra3.2.12β#39 under control of the GAL1 promoter].

Transformants were selected on media lacking tryptophan or uracil or
lacking both tryptophan and uracil. As previously done with tryptophan
transformants obtained above with YEpFlag or YEpFlagCOLIII1.6kb,
transformants were grown in selective media prior to growth in YPHSM,
YEPM, YEPD, YEPG or YEPE and after 4 days galactose was added to a final
concentration of 2%, 0.5% or 0.2%. Total yeast protein or supernatants were
analysed by Western blot analysis as described above except that a third
MAb [5B5 against the β subunit] (Dako Corporation, Cat. No. M877) was also
used.

Western blot analysis revealed the presence of a ~60 kDa band in Trp+
or Trp+ Ura+ yeast transformed with YEpFlag COLIII1.6kb but not YEpFlag
alone when screened with MAb M1 or M2 as was previously the case with
transformants obtained with single plasmid transformation.

Analysis also showed the presence of a ~60 kDa band in Ura+ or Ura+
Trp+ but not Trp+ yeast transformants transformed with pYEUra3.2.12β#39 or
pYEUra3.2.12β#39α#5 or cotransformed with same plus YEpFlag or
YEpFlagCOLIII 1.6kb when screened with anti-β subunit MAb 5B5 but only
following induction with galactose and only when galactose was between 0.2
and 0.5% and not at 2%. The expected size for the β subunit is also 60 kDa.
This band is not detected by M1 or M2 in uracil auxotrophic yeast
transformed with pYEUra3.2.12β#39 or pYEUra3.2.12β#39α#5 alone.

At the time of the experimentation, an antibody for the detection of
expression of the α subunit from the bidirectional GAL1-10 promoter in
pYEUra3.2.12β#39α#5 was not available but as the promoters for both GAL1
and GAL10 are normally co-induced and under the control of the same UAS
(upstream activation sequence) in yeast it was assumed that the α subunit is
also transcribed and expressed where the β subunit is demonstrated to be
expressed. To test this, the capacity for pYEUra3.2.12β#39α#5 / YEpFlag
COLIII 1.6kb co-transformants induced with 0.2% galactose following at least
4 days growth on YPHSM to produce functional P4H was examined.
Galactose was added following the clear demonstration of the expression of
Flag-collagen by a positive response of yeast protein to M1 or M2 in Western
blots and the absence of a response to MAb 5B5 against β subunit. Following
induction with galactose [16 hrs] protein was again examined and the
presence of M1 or M2 responsive bands and 5B5 responsive bands was
separately demonstrated. Protein was transferred to PVDF membrane
following SDS-PAGE and the membrane sliced into strips. Membrane strips
containing protein from the region corresponding to the 60 kDa responsive
area were subjected to hydrolysis and amino acid analysis. Amino acid analysis
revealed the presence of hydroxyproline in this material from
co-transformants of yeast co-transformed with YEpFlagCOLIII1.6kb and
pYEUra3.2.12β#39α#5 after induction with 0.2% galactose but no
hydroxyproline was detected with protein from control samples with or
without galactose.

The media used contains peptone derived from bovine protein
hydrolysates but no hydroxyproline was found in total yeast grown on this
media nor in any of the singly transformed yeast [one vector alone]. Only in
yeast co-transformants was hydroxyproline detected in the 60 kDa bands and
then only when galactose was added. Uninduced co-transformants [no
galactose] in which Flag detected collagen was expressed did not contain any
hydroxyproline in the 60 kDa band excised from PVDF following transfer.
Hydroxyproline was only found in the 60 kDa region and not in other regions
of the blot.

The clear evidence then, is that following galactose induction of
pYEUra3.2.12β#39α#5 a product is produced in yeast which is capable of
hydroxylating the proline residues of a co-expressed Flag-tagged collagen
fragment. Such activity is not found in yeast untransformed or transformed
with pYEUra3.2.12β#39 [no α subunit] or in uninduced yeast grown on
ethanol or glucose.

A clear advantage of this method of co-expression for the production
of hydroxylated collagen in yeast is the co-ordinated expression of the three
genes that is possible in co-transformants. Another advantage is that the α
and β subunits themselves are co-ordinately expressed. A third advantage is
that the αβ expression vector (i.e. pYEUra3.2.12β#39α#5) contains a
centromere sequence and behaves as a mini-chromosome. It is therefore very
stable and does not require selection pressure to be maintained for its
stability. The removal of selection pressure in yeast does not appear to affect
the stability of the YEpFlag collagen construct as it is in very high copy
number, but clearly the ability to only be concerned with maintenance of a
single plasmid in the absence of selection pressure is important rather than
balancing the effects of selection pressure on the stability of three separate
plasmids if the α, β and collagen fragments were separately cloned on
multicopy vectors. Also the use of a bidirectional promoter to express the α
and β subunits simultaneously is of benefit rather than expressing them from
different promoters on different plasmids in different amounts. The α
subunit probably requires the synthesis of equal or higher levels of the β
subunit for its correct assembly into functional P4H (α2β2) enzyme and
co-ordinated expression appears to be an efficient mechanism to ensure this.
*1 [Codon numbering for collagen type III alpha 1 chain: ATG, codon #1; codon #1-codon #24, signal sequence;
codon #25-codon #116, N-pro-peptide sequence; codon #117-codon #130, N-telo-peptide sequence; codon #131-
codon #1161, α-helix sequence; codon #1162-codon #1186, C-telo-peptide; codon #1187-codon #1441, C-pro-
peptide; codon #1442, stop] and [corresponding nucleotide numbering for collagen type III alpha 1 chain: nucleotide
#1-72, signal sequence; nucleotide #73-348, N-pro-peptide sequence; nucleotide #349-390, N-telo-peptide; nucleotide
#391-nt#3983, α-helix region; nucleotide #3984-4058, C-telo-peptide; nucleotide #4059-4823, C-pro-peptide
sequence; nucleotide #4824-4826, stop codon].

Example 3: Use of Yeast Artificial Chromosomes [YACs] for co-ordinated
expression of the α and β subunits of Prolyl-4-hydroxylase [P4H].

pYAC5 [11454 bp] (Kuhn and Ludwig, 1994) was digested with BamHI
to liberate the HIS3 gene [1210 bp] from between the 2 telomere ends and
with SalI-NruI to produce two fragments [left arm: fragment 1, 5448 bp & right
arm: fragment 2, 4238 bp] which were gel purified. Fragment 1 was
BamHI-telomere end-E. coli ori-β-lactamase gene [ampicillin-resistance]-
TRP1-ARS1-CEN4-tRNAsup-o-SalI. Fragment 2 was BamHI-telomere end-URA3-NruI.

pYEUra3.2.12β#39α#5 was digested with SalI-EcoRV to produce a
P4H expression cassette fragment of the form SalI-XbaI-BamHI-α-ATG-
BamHI-pGAL1-10-EcoRI-ATG-β-EcoRI-SmaI-EcoRV [4864 bp] which was gel
purified. The expression cassette fragment encoding the α and β subunits of
P4H under the control of a galactose inducible bidirectional promoter was
ligated with fragments 1 and 2 of the BamHI-SalI-NruI digested pYAC5 and
the ligation mix used to transform the following yeast strains: BJ2407 [a/α
prb1-1122/prb1-1122 prc1-407/prc1-407 pep4-3/pep4-3 leu2/leu2 trp1/trp1
ura3-52/ura3-52], KRYD1 [a/α ura3-52/ura3-52 trp1-Δ101/trp1 lys2-208/LYS2
HIS3/his3Δ200 gal2/GAL2 can1/can1 pep4::HIS3/pep4::HIS3
prb1Δ1.6R/prb1Δ1.6R], GY1 [a leu2 ade1 trp1 ura3], JHRY1-5Da [a his4-519
ura3-52 leu2-3 leu2-112 trp1 pep4-3], and YPH150 [α/a ura3-52/ura3-52
lys2-801a/lys2-801a ade1-101o/ade1-101o leu2Δ1/leu2Δ1 trp1-Δ63/trp1-Δ63
his3Δ200/his3Δ200] using the method for lithium acetate transformation.
Yeast strains were also transformed with pYAC5 digested with BamHI and
undigested pYAC5.

Ura+ Trp+ co-transformants were obtained for all strains where the
two fragments of pYAC5 each carrying either TRP1 [SalI-CEN4-TRP1-BamHI]
[fragment 1] or URA3 [NruI-URA3-BamHI] [fragment 2] as the selectable
marker for transformation, each on one arm of the YAC, had been linked
together by the insertion of the P4H expression cassette into the SalI-EcoRV
sites. This vector was designated pYAC5βα (Figure 5). The vector was of the
form BamHI-telomere-URA3-NruI/EcoRV [both sites destroyed]-β-ATG-
pGAL10-1-ATG-α-SalI-tRNAsup-CEN4-ARS1-TRP1-AMPr-ori-telomere-
BamHI. The presence of the CEN4 sequence means the vector behaves as a
stable chromosome during replication and is segregated at least 1 copy per
cell at mitosis and meiosis [as was the case for pYEUra3.2.12β#39α#5]. The
telomere ends mean that the vector is linear and stable.
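
The fragment sizes quoted in this example allow a rough base-pair tally for the construction. The sketch below uses only the sizes given in the text; the nominal size of the assembled linear vector is derived arithmetic and is not a figure stated in the document, and it assumes no bases are gained or lost at the ligation junctions.

```python
# Sizes in bp, as quoted in the text.
PYAC5_BP = 11454
HIS3_BAMHI_FRAGMENT_BP = 1210
LEFT_ARM_BP = 5448      # fragment 1 (TRP1-ARS1-CEN4 arm)
RIGHT_ARM_BP = 4238     # fragment 2 (URA3 arm)
P4H_CASSETTE_BP = 4864  # SalI-EcoRV cassette

# pYAC5 sequence not assigned to HIS3 or either arm in the text.
unassigned_bp = PYAC5_BP - HIS3_BAMHI_FRAGMENT_BP - LEFT_ARM_BP - RIGHT_ARM_BP

# Nominal size of the linear arm-cassette-arm product (assumption: exact joins).
nominal_vector_bp = LEFT_ARM_BP + P4H_CASSETTE_BP + RIGHT_ARM_BP

print(unassigned_bp, nominal_vector_bp)
```

The few hundred base pairs left unassigned would be consistent with a small SalI-NruI piece that is discarded during gel purification, although the text does not say so.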

Transformants and controls [pYAC5 alone (circular), pYAC5
linearised by BamHI digestion] were replica plated onto nitrocellulose filters
laid over selective media [SD Complete lacking uracil and tryptophan] or rich
media [YEPD] and incubated 2-5 days at 30°C till confluent. Filters were
transferred to selective media containing galactose [2%] instead of glucose or
rich media containing galactose [2%] as well as glucose media plates and
grown at 30°C for periods between 2h-72h. At the end of incubation colonies
were lysed on 0.1% SDS-0.2N NaOH-0.1% β-mercaptoethanol, washed with
water and filters blocked with Blotto. Production of the α and β subunits of
P4H was ascertained by hybridising the treated filters with MAbs specific for
the α [MAb 9-47H10] (ICN Biomedical Inc. Cat. No. 631633) and β [MAb 5B5]
subunits. Colonies transformed with pYAC5βα and induced with galactose
showed hybridisation with MAbs against the subunits of P4H demonstrating
co-ordinated production of α and β from the bi-directional GAL1-10
promoter. Control filters and control yeast did not produce a response to
P4H MAbs. Yeast transformants carrying pYAC5βα grown on glucose [a
repressor of the bi-directional GAL1-10 promoter] also did not produce a
positive response.

Positive transformants identified in the above screening procedure
were precultured/grown in 10 ml liquid culture media containing selective
media lacking ura and trp or rich media [containing glucose, glycerol or
raffinose]. Aliquots were transferred to inducing media [selective or rich]
containing 0.2-2% galactose. Where glucose was the carbon source pellets
were washed in sterile water prior to induction. After 2-20h further growth
at 30°C cell pellets were collected, suspended in loading buffer and total yeast
protein separated on SDS-PAGE and western blotted. Filters were blocked
with blotto and hybridised with MAbs against both of the P4H subunits.
Only those yeast transformants carrying pYAC5βα and induced with
galactose gave the expected 60 kDa bands for α and β subunits. This
demonstrates that the P4H expression cassette has been functionally inserted
into pYAC5. The advantage of having the P4H cassette in the pYAC is
twofold: [1] as with the case of pYEUra3.2.12β#39α#5 the presence of the
CEN sequence means that the vector is stably maintained in this system
when selection pressure is removed for growth in rich media, which
increases yield through increased cell density, and [2] the pYAC5βα
construct allows for the subsequent insertion of multiple and different triple
helical protein expression cassettes.

Example 4: Co-expression of collagen/triple helical protein fragment[s]
expressed on a multicopy plasmid and P4H subunits in yeast transformants
carrying pYAC5βα.

Yeast host strains containing pYAC5βα or pYAC5 were transformed
with YEpFlagCOLIII 1.6kb or YEpFlag alone. The form of the collagen bearing
vector was circular and multicopy. In this instance, as the YEpFlagCOLIII
1.6kb and the pYAC constructs both contain the same selectable marker,
yeast transformants producing Flag tagged-collagen were identified by colony
hybridisation with MAbs against Flag [M1 or M2]. Colonies were also
screened for whether they carried extra copies of bla gene [multicopy] by
identifying those colonies producing increased levels of β-lactamase by
PADAC assay (Macreadie et al., 1994). In other examples, the multicopy
plasmid could utilise a different selectable marker other than URA3 or TRP1
found on each arm of the YAC. Various co-transformant types carrying
pYAC5βα and YEpFlag COLIII 1.6kb were assayed as in Example 1 for
collagen production, P4H subunit production, and P4H activity. Those
co-transformants containing pYAC5βα plus YEpFlag COLIII 1.6kb were then
screened as described in the previous example for hydroxylated collagen to
identify 60 kDa bands in western blots responding to MAbs against the α and
β and Flag following induction. The α and β subunits were only identified
following galactose induction. Hydroxylated protein was only identified
following induction of both the α and β subunits of P4H.
Example 5: Introduction of collagen expression cassette into pYAC5 and
pYAC5βα.

YEpFlag was linearised by digestion with ScaI which cuts at a single
recognition site in the ampicillin resistance gene for β-lactamase [bla]. There
are no ScaI sites in the 1.6kb collagen fragment insert so ScaI could also be
used to linearise YEpFlagCOLIII 1.6kb. Linear DNA was used to transform
yeast containing pYAC5 or pYAC5βα. Yeast transformants producing Flag
tagged-collagen were identified by colony hybridisation with MAbs against
Flag [M1 or M2]. Colonies carrying extra copies of bla gene [multicopy] were
also identified. Those colonies producing increased levels of β-lactamase by
the PADAC assay were found to have inserted a copy of YEpFlag COLIII 1.6kb
into the pYAC5 or pYAC5βα vector of the host strain and correspond to those
colonies positive to MAbs M1 or M2. The increased β-lactamase activity is a
result of gene amplification resulting from homologous recombination
between the linearised bla gene on YEpFlagCOLIII 1.6kb and the bla gene on
pYAC. The new plasmids formed by insertion into pYAC5 or pYAC5βα of
the YEpFlag COLIII 1.6kb vector were designated pYAC-COLIII 1.6kb and
pYAC αβ-COLIII 1.6kb (Figure 6). Expression experiments were performed
and only those strains carrying all 3 genes on the YAC [pYAC βα-COLIII
1.6kb] and induced for P4H with galactose produced hydroxylated collagen.

Example 6: Cloning and expression of a synthetic collagen protein. 

A strategy is described for the generation of "synthetic/novel"
collagen proteins involving the in vitro assembly of synthetic
oligonucleotide repeat sequences encoding the peptide GPP.GPP.GXY
(where XY = LA, ER, PA or AP). The synthetic collagen sequences are
engineered to contain a high percentage of proline residues as this residue
has been shown to confer thermal stability to collagen molecules. The
residue pairs chosen for the XY position (i.e. LA, ER, PA or AP), are selected
since they appear in statistically higher amounts in fibrillar collagens.

Mixtures of synthetic oligonucleotides encoding GPP.GPP.GXY may 
be joined together to generate DNA fragments of discrete lengths, encoding 
synthetic collagen proteins of discrete molecular size and with different 
physical characteristics. These synthetic gene segments can be cloned into
various expression vectors for subsequent production of a collagen product in
yeast. An outline of the strategy for construction of a synthetic
oligonucleotide encoding a collagen is shown in Figure 7 where XY is shown,
for the purposes of exemplification only, as ER, LA, AP, PA.

Such synthetic oligonucleotides have been synthesised and several 
libraries containing gene segments of various lengths have been generated by 
ligating these oligonucleotides together (maximum visible DNA length
approx. 1000 base pairs coding for a polypeptide of ~350 amino acid
residues). 
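
The repeat design can be made concrete with a small sketch that builds the four GPP.GPP.GXY units and relates DNA length to peptide length. The unit definitions follow the text; the function names are illustrative, and the calculation simply assumes three nucleotides per residue with no linker or flanking sequence.

```python
# XY residue pairs named in the text.
XY_PAIRS = ["LA", "ER", "PA", "AP"]

def repeat_unit(xy):
    """Return the nine-residue peptide GPP.GPP.GXY for a given XY pair."""
    return "GPP" + "GPP" + "G" + xy

def proline_fraction(peptide):
    """Fraction of residues in the peptide that are proline."""
    return peptide.count("P") / len(peptide)

units = {xy: repeat_unit(xy) for xy in XY_PAIRS}

NT_PER_UNIT = 3 * len(units["LA"])   # 27 nt per nine-residue unit
max_units = 1000 // NT_PER_UNIT      # whole units fitting in ~1000 bp
max_residues = max_units * len(units["LA"])

for xy, peptide in units.items():
    print(xy, peptide, round(proline_fraction(peptide), 2))
print(NT_PER_UNIT, max_units, max_residues)
```

On this simple accounting about 37 units fit into 1000 bp, giving a polypeptide of a little over 330 residues, which is of the same order as the approximate figure quoted above.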

Example 7: Construction of a synthetic hydroxylated triple helical protein
for stable expression in yeast.

A region of Type III collagen was selected for its known capacity to
bind and activate platelets [through an integrin binding site near -Gly-Leu-
Ala-Gly-Ala-Pro-Gly-Leu-Arg]. A region of 5 GLY-X-Y repeats to the
N-terminal side and 7 GLY-X-Y repeats to the C-terminal side were also
included to form the basic repeat unit for inclusion in the synthetic fragment.
The sequence of the repeat was GGKGDAGAPGERGPP-GLAGAPGLR-
GGAGPPGPEGGKGAAGPPGPP. This corresponds to residues 637-681
(nucleotides 1909-2043) in the COL3A1 gene [with Met = 1]. At the 5' end of
the DNA an EcoRI site and NheI site were included such that the NheI site
provided an initiating methionine. Thus the sequence at the amino end is
MGAPGAP, where GAPGAP is the natural sequence flanking the repeat in
COL3A1. The repeat was linked to a second repeat by a linker which
introduced a Bsp120I site for later manipulations and provided the sequence
GGP between the first and second repeat unit. The second repeat was linked
to a third repeat by a linker which introduced a BssHII site [again for later
manipulation] and resulted in the amino acid sequence GAR. The third
repeat was flanked by 2 additional GPP triplets, a GCC triplet and finally
GLEGPRG. This was a result of including coding sequence that provided for
XhoI, SacII and NheI sites. These were included for flexibility of cloning at
later stages. The NheI site provides an in frame stop codon.
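
The properties claimed for the basic repeat unit (45 residues, glycine at every third position, and the residue-to-nucleotide correspondence with Met = 1) can be verified with a brief sketch. The sequence is copied from the text; the helper names are illustrative.

```python
# Repeat unit as given in the text: 5 Gly-X-Y triplets, the 9-residue
# binding region, then 7 Gly-X-Y triplets.
REPEAT = "GGKGDAGAPGERGPP" + "GLAGAPGLR" + "GGAGPPGPEGGKGAAGPPGPP"

def is_gly_x_y(peptide):
    """True if the peptide is a whole number of triplets, each starting with Gly."""
    return len(peptide) % 3 == 0 and all(
        peptide[i] == "G" for i in range(0, len(peptide), 3)
    )

def residues_to_nucleotides(first, last):
    """Map a 1-based residue range to its 1-based nucleotide range (Met = 1)."""
    return 3 * (first - 1) + 1, 3 * last

triplets = [REPEAT[i:i + 3] for i in range(0, len(REPEAT), 3)]

print(len(REPEAT), is_gly_x_y(REPEAT))
print(triplets)
print(residues_to_nucleotides(637, 681))
```

The triplet list makes the layout easy to read: five triplets precede GLA-GAP-GLR and seven follow it, and residues 637-681 map to nucleotides 1909-2043 as stated.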

The synthetic fragment was produced by PCR from primers against
COL3A1 in 3 pieces initially. Fragment 1 was EcoRI-NheI-Met-[GAP]2-
[REPEAT]1-Bsp120I. The primers for this were 5'-aattccatg-
ggtgctccaggtgctcc-3' [up] [primer U101] and 5'-ggcc-acctggtggacctggtgg-3'
[down] [primer D101]. The second PCR fragment used primers 5'-ggccc-
ggtggtaagggtgacgc-3' [up] [primer U102] and 5'-cgcgc-acctggtggacctgg-3'
[down] [primer D102]. For the 3rd repeat primer pairs used were 5'-cgcgc-
ggtggtaagggtgacgctgg-3' [up] [primer U103] and 5'-acaaccctggtggacctggtggacc-
[sequence illegible]-3' [down] [primer D103]. The three fragments from the PCR
reactions were gel purified and ligated together. The DNA from the ligation
mixture was then used as the template for a further round of PCR using
primer U101 and a new primer at the 3' end [5'-ctagccccgcggaccctcgagaccaca-
acaaccctggtgg-3'] [down] [primer D104]. A band of approximately 500 bp
was produced and gel purified, digested with EcoRI-NheI and ligated to
pYX141 (Ingenius Cat. No MBV-025-10) [LEU2-CEN-p786] also digested with
EcoRI-NheI before being transformed into E. coli. Transformants were
screened by PCR using primers for the second fragment and DNA from
positive colonies was miniprepped and screened by enzyme digestion with
EcoRI-NheI for the presence of an insert of approximately 500 bp. This storage
vector was designated pYX-SYN-C3-1. The EcoRI-NheI fragment was
transferred to pYX243 [2µ-LEU2-pGAL] (Ingenius Cat. No MBV-035-10) to
give pYX-SYN-C3-2 and this plasmid was introduced into a yeast host cell
carrying nucleotide sequences encoding the P4H α and β subunits
[either pYEUra3.2.12β#39α#5 or pYACαβ]. Expression following galactose
induction was determined by using a MAb 2G8/B1 (Werkmeister & Ramshaw,
1991) which recognises the sequence GLAGAPGLR. An EcoRI-SacII fragment
from pYX-SYN-C3-2 was also introduced into the EcoRI-SacII sites of YEpFlag to
produce YEpFlag-SYN-C3 and this too was introduced into a yeast host cell
expressing P4H on induction by galactose. A product of approximately 18
kDa [the expected size of SYN-C3] was detected in yeast induced with
galactose by Western blotting.
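
The approximately 500 bp and 18 kDa figures can be cross-checked against the construct layout described earlier in this example. The sketch below adds up the segments in the order the text gives them; the segment list is a reading of that description, the average residue mass of 110 Da is a rule-of-thumb assumption, and the Flag fusion variant is not considered.

```python
# Segment lengths (residues) for SYN-C3, read from the description in this example.
SEGMENTS = [
    ("initiating Met", 1),
    ("GAPGAP flank", 6),
    ("repeat 1", 45),
    ("GGP linker", 3),
    ("repeat 2", 45),
    ("GAR linker", 3),
    ("repeat 3", 45),
    ("two GPP triplets", 6),
    ("GCC triplet", 3),
    ("GLEGPRG", 7),
]

# Assumption: rule-of-thumb average mass per residue, in daltons.
AVG_RESIDUE_MASS_DA = 110

total_residues = sum(length for _, length in SEGMENTS)
coding_bp = 3 * total_residues  # excludes stop codon and cloning-site overhangs
approx_mw_kda = total_residues * AVG_RESIDUE_MASS_DA / 1000

print(total_residues, coding_bp, round(approx_mw_kda, 1))
```

On these assumptions the coding region is just under 500 bp and the estimated mass is close to 18 kDa, in line with the band sizes reported.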

The nucleotide sequence for SYN-C3 is provided at Figure 8 together 
with the amino acid sequence of the encoded product. 

Example 8: The use of yeast other than Saccharomyces cerevisiae. 

The GAL1-10 promoter is functional in Kluyveromyces whilst the
ADH2 promoter is constitutively expressed in S. pombe. By shifting the
expression cassettes to appropriate vectors, other yeast hosts can be used.
K. lactis for instance has been shown in some instances to display less
proteolytic activity for recombinant products. Alternatively, P. pastoris could
be used for multiple integration of the expression cassette for α β into the
chromosome.
For expression in P. pastoris, the nucleotide sequence described in
the previous example encoding the synthetic triple helical protein [SYN-C3]
was inserted into the P. pastoris vector pPIC9 (Invitrogen, Cat. No. K1710-01)
at the EcoRI-NotI sites [pPIC-SYN-C3]. Following digestion with either BglII
or SalI, the plasmid was introduced into P. pastoris where it was integrated at
either the AOX1 or HIS4 sites for BglII or SalI respectively. The nucleotide
sequences encoding the P4H α and β subunits were also introduced into P.
pastoris using the EcoRI site of pHIL-D2 (Invitrogen, Cat. No. K1710-01) for
the β subunit and integration at HIS4 and the BamHI site of pHIL-S1
(Invitrogen, Cat. No. K1710-01) for the α subunit and subsequent integration
at HIS4. All three expression cassettes were under the control of the AOX1
promoter and induced by methanol.

Example 9: Enhanced expression of prolyl-4-hydroxylase α and β subunits
from the GAL1-10 promoter by use of yeast with different backgrounds for
control of galactose induced expression.

The plasmid pYEUra3.2.12β#39α#5 [encoding the α and β subunits
of P4H under the control of the GAL1-10 bidirectional promoter] can be
introduced into a yeast host cell with the following genotype: a or α, ura3
trp1 egd1 btt1. In these cells, the absence of the products for the EGD1 and
BTT1 genes results in higher levels of galactose induced expression from
GAL4 dependent promoters such as GAL2, GAL4, GAL7, GAL1-10, MEL1 (Hu
& Ronne, 1994).

Another mechanism for enhanced expression is the use of a yeast
host cell carrying multiple copies of the GAL4 (Johnston & Hopper, 1982)
positive transcriptional activator under its own controlled induction by
galactose. This leads to enhanced expression as there is no limit to the
availability of the transcriptional activator for the GAL1-10 promoter.
Similarly, the yeast host cell could contain multiple copies of the SGE1 gene
(Amakasu et al., 1993) which also leads to enhanced transcription from
galactose induced promoters.

Various combinations of these backgrounds could also be utilised;
that is egd1 btt1 SGE1mc or egd1 btt1 GAL4mc or egd1 btt1 SGE1mc GAL4mc
[where mc represents multiple copies].
Example 10: Expression of collagen from promoters other than ADH2.

The collagen encoding nucleotide sequence in YEpFlag COL 1.6kb
can be excised as a NheI or HindIII - XbaI or NruI fragment for insertion into
other fusion vectors under the control of other promoters. Alternatively, the
pADH2-α signal-α-proregion-Flag collagen cassette can be excised as a NaeI
or SacI - BglII or XbaI or SpeI or SnaBI or NotI, for example, and introduced
into an appropriate vector such as YEplac181 (Gietz & Sugino, 1988) or
pMH158 (Heuterspreute et al., 1985) for expression in different copy numbers
and host backgrounds or into vectors with CEN sequences. Alternatively,
CEN sequences can be introduced into the YEpFlag vector itself. The cassette
can also be removed without the ADH2 promoter using NruI and introduced
into an appropriate vector behind an appropriate promoter.

Collagen encoding nucleotide sequences can be expressed using the
CUP1 promoter in vectors such as pYELC5 (Macreadie et al., 1989) as an
alternative to the ADH2 promoter. This promoter is induced by addition of
copper (i.e. copper sulfate) and may have the advantage of an increased
reducing environment and enhancement of P4H activity during
co-expression. A second promoter that can be used is the TIP1 promoter which
is induced by cold shock. Here the stability of the expressed collagen may be
enhanced without the need for hydroxylation by inducing expression by
shifting growing yeast from 30°C to 18°C.

The method according to the invention provides for the stable 
expression of triple helical proteins from yeast host cells. The products of 
the method may be natural and synthetic collagens, natural and synthetic
collagen fragments and natural and synthetic collagen-like proteins.
Synthetic products may show enhanced or novel functions (e.g. inclusion of
RGD and/or YIGSR sequences from fibronectin and laminin). The products
may be used in a wide range of applications including bioimplant
production, soft and hard tissue augmentation, wound/burn dressings,
sphincter augmentation for urinary incontinence and gastric reflux,
periodontal disease, vascular grafts, drug delivery systems, cell delivery
systems for natural factors and as conduits in nerve regeneration.
It will be appreciated by persons skilled in the art that numerous
variations and/or modifications may be made to the invention as shown in
the specific embodiments without departing from the spirit or scope of the
invention as broadly described. The present embodiments are, therefore, to
be considered in all respects as illustrative and not restrictive.
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Claims: 

1. A method of producing a hydroxylated triple helical protein in yeast 
comprising the steps of: 

introducing to a suitable yeast host cell a first nucleotide sequence 
encoding P4H α subunit, a second nucleotide sequence encoding P4H β 
subunit and one or more product-encoding nucleotide sequences which 
encode(s) a polypeptide(s) or peptide(s) which, when hydroxylated, form the 
said hydroxylated triple helical protein, each of said first, second and 
product-encoding nucleotide sequences being operably linked to promoter 
sequences, and 

culturing said yeast host cell under conditions suitable to achieve 
expression of said first, second and product-encoding nucleotide sequences 
to thereby produce said hydroxylated triple helical protein; 

wherein said method is characterised in that the step of introducing the first, 
second and product-encoding nucleotide sequences results in the said first, 
second and product-encoding nucleotide sequences, together with their 
respective operably linked promoter sequences, being borne on one or more 
replicable DNA molecules that are stably retained and segregated by said 
yeast host cell during said step of culturing. 

2. A method according to claim 1, wherein the product-encoding 
nucleotide sequence(s) is/are nucleotide sequence(s) encoding a natural 
collagen or fragment thereof. 

3. A method according to claim 2, wherein the product-encoding 
nucleotide sequence(s) is/are selected from COL1A1, COL1A2, COL2A1 and 
COL3A1 and fragments thereof. 

4. A method according to claim 3, wherein the product-encoding 
nucleotide sequence(s) is COL3A1. 

5. A method according to claim 1, wherein the product-encoding 
nucleotide sequence(s) is/are a nucleotide sequence(s) encoding a synthetic 
polypeptide(s) or peptide(s) of the general formula: 
(A)_l-(B)_m-(Gly X Y)_n-(C)_o-(D)_p, 
in which Gly is glycine, X and Y represent the same or different amino 
acids, the identities of which may vary from Gly X Y triplet to Gly X Y triplet 
but wherein Y must be ≥ one proline, A and D are polypeptide or peptide 
domains which may or may not include triple helical forming (Gly X Y)_n 
repeating sequences, B and C are intervening sequences which do not 
contain triple helical forming (Gly X Y)_n repeating sequences, n is in the 
range of 2 to 1500 and l, m, o and p are each independently selected from 0 
and 1. 

6. A method according to any one of the preceding claims, wherein the 
first and second nucleotide sequences are expressed from a bidirectional 
promoter sequence. 

7. A method according to claim 6, wherein the bidirectional promoter 
sequence is the yeast GAL1-10 promoter sequence. 

8. A method according to any one of the preceding claims, wherein the 
first and second nucleotide sequences are of avian or mammalian origin. 

9. A method according to claim 8, wherein the first and second 
nucleotide sequences are of human origin. 

10. A method according to any one of the preceding claims, wherein the 
second and product-encoding nucleotide sequences encode secretion signals 
such that expressed P4H and product polypeptide(s) or peptide(s) are 
secreted. 

11. A method according to any one of the preceding claims, wherein the 
first, second and product-encoding nucleotide sequences are introduced to 
the yeast host cell such that they are present on one or more vector(s) 
including a CEN sequence(s). 

12. A method according to any one of the preceding claims, wherein the 
first, second and product-encoding sequences are introduced to the yeast 
host cell such that they are present on one or more vector(s) including a CEN 
sequence(s) and one or two high copy number vector(s). 

13. A method according to claim 11 or 12, wherein the one or more 
vector(s) including a CEN sequence(s) are selected from YAC vectors. 

14. A method according to claim 12 or 13, wherein the one or two high 
copy number vector(s) are selected from YEp plasmids. 

15. A method according to claim 11, wherein the first, second and 
product-encoding nucleotide sequences are present on a single YAC vector. 

16. A method according to any one of the preceding claims, wherein the 
yeast host cell is selected from the genera Kluyveromyces, Saccharomyces, 
Schizosaccharomyces, Yarrowia and Pichia. 

17. A yeast host cell capable of producing a hydroxylated triple helical 
protein, said yeast host cell including a first nucleotide sequence encoding 
P4H α subunit, a second nucleotide sequence encoding P4H β subunit and 
one or more product-encoding nucleotide sequences which encode(s) a 
polypeptide(s) or peptide(s) which, when hydroxylated, form the said 
hydroxylated triple helical protein, each of said first, second and 
product-encoding nucleotide sequences being operably linked to promoter 
sequences, and wherein said first, second and product-encoding nucleotide 
sequences, together with their respective operably linked promoter 
sequences, are borne on one or more replicable DNA molecules that are 
stably retained and segregated by said yeast host cell. 

18. A yeast host cell according to claim 17, wherein the product- 
encoding nucleotide sequence(s) is/are nucleotide sequence(s) encoding a 
natural collagen or fragment thereof. 

19. A yeast host cell according to claim 18, wherein the 
product-encoding nucleotide sequence(s) is/are selected from COL1A1, COL1A2, 
COL2A1 and COL3A1 and fragments thereof. 

20. A yeast host cell according to claim 19, wherein the 
product-encoding nucleotide sequence(s) is COL3A1. 



21. A yeast host cell according to claim 17, wherein the 
product-encoding nucleotide sequence(s) is/are a nucleotide sequence(s) encoding a 
synthetic polypeptide(s) or peptide(s) of the general formula: 
(A)_l-(B)_m-(Gly X Y)_n-(C)_o-(D)_p, 
in which Gly is glycine, X and Y represent the same or different 
amino acids, the identities of which may vary from Gly X Y triplet to Gly X Y 
triplet but wherein Y must be ≥ one proline, A and D are polypeptide or 
peptide domains which may or may not include triple helical forming 
(Gly X Y)_n repeating sequences, B and C are intervening sequences which do not 
contain triple helical (Gly X Y)_n repeating sequences, n is in the range of 2 to 
1500 and l, m, o and p are each selected from 0 and 1. 

22. A yeast host cell according to any one of claims 17 to 21, wherein the 
first and second nucleotide sequences are expressed from a bidirectional 
promoter sequence. 

23. A yeast host cell according to claim 22, wherein the bidirectional 
promoter sequence is the yeast GAL1-10 promoter sequence. 

24. A yeast host cell according to any one of claims 17 to 23, wherein the 
first and second nucleotide sequences are of avian or mammalian origin. 

25. A yeast host cell according to claim 24, wherein the first and second 
nucleotide sequences are of human origin. 

26. A yeast host cell according to any one of claims 17 to 25, wherein the 
second and product-encoding nucleotide sequences encode secretion signals 
such that expressed P4H and product polypeptide(s) or peptide(s) are 
secreted. 

27. A yeast host cell according to any one of claims 17 to 26, wherein 
the first, second and product-encoding nucleotide sequences are introduced 
to the yeast host cell such that they are present on one or more vector(s) 
including a CEN sequence(s). 

28. A yeast host cell according to any one of claims 17 to 26, wherein the 
first, second and product-encoding sequences are introduced to the yeast 
host cell such that they are present on one or more vector(s) including a CEN 
sequence(s) and one or two high copy number vector(s). 

29. A yeast host cell according to claim 27 or 28, wherein the one or 
more vector(s) including a CEN sequence(s) are selected from YAC vectors. 

30. A yeast host cell according to claim 28 or 29, wherein the one or two 
high copy number plasmid(s) are selected from YEp plasmids. 

31. A yeast host cell according to claim 27, wherein the first, second and 
product-encoding nucleotide sequences are present on a single YAC vector. 

32. A yeast host cell according to any one of claims 17 to 31, wherein the 
yeast host cell is selected from the genera Kluyveromyces, Saccharomyces, 
Schizosaccharomyces, Yarrowia and Pichia. 

33. A triple helical protein produced in accordance with the method of 
any one of claims 1 to 16. 

34. A biomaterial or therapeutic product comprising a triple helical 
protein produced in accordance with the method of any one of claims 1 to 16. 

Figure 2 



1 CCAGGCCCAC TTGGGATTGC TGGGATCACT GGAGCACGGG GTCTTGCAGG ACCACCAGGC 

61 ATGCCAGGTC CTAGGGGAAG CCCTGGCCCT CAGGGTGTCA AGGGTGAAAG TGGGAAACCA 

121 GGAGCTAACG GTCTCAGTGG AGAACGTGGT CCCCCTGGAC CCCAGGGTCT TCCTGGTCTG 

181 GCTGGTACAG CTGGTGAACC TGGAAGAGAT GGAAACCCTG GATCAGATGG TCTTCCAGGT 

241 CGAGATGGAT CTCCTGGTGG CAAGGGTGAT CGTGGTGAAA ATGGCTCTCC TGGTGCCCCT 

301 GGCGCTCCTG GTCATCCAGG CCCACCTGGT CCTGTCGGTC CAGCTGGAAA GAGTGGTGAC 

361 AGAGGAGAAA GTGGCCCTGC TGGCCCTGCT GGTGCTCCCG GTCCTGCTGG TTCCCGAGGT 

421 GCTCCTGGTC CTCAAGGCCC ACGTGGTGAC AAAGGTGAAA CAGGTGAACG TGGAGCTGCT 

481 GGCATCAAAG GACATCGAGG ATTCCCTGGT AATCCAGGTG CCCCAGGTTC TCCAGGCCCT 

541 GCTGGTCAGC AGGGTGCAAT CGGCAGTCCA GGACCTGCAG GCCCCAGAGG ACCTGTTGGA 

601 CCCAGTGGAC CTCCTGGCAA AGATGGAACC AGTGGACATC CAGGTCCCAT TGGACCACCA 

661 GGGCCTCGAG GTAACAGAGG TGAAAGAGGA TCTGAGGGCT CCCCAGGCCA CCCAGGGCAA 

721 CCAGGCCCTC CTGGACCTCC TGGTGCCCCT GGTCCTTGCT GCGGTGGTGT TGGAGCCGCT 

781 GCCATTGCTG GGATTGGAGG TGAAAAAGCT GGCGGTTTTG CCCCGTATTA TGGACCTGAA 

841 CCAATGGATT TCAAAATCAA CACCGATGAG ATTATCACTT CACTCAAGTC TGTTAATGGA 

901 CAAATAGAAA GCCTCATTAG TCCTGATGGT TCTCGTAAAA ACCCCGCTAG AAACTGCAGA 

961 GACCTGAAAT TCTGCCATCC TGAACTCAAG ACTGGAGAAT ACTGGGTCGA CCCTAACCAA 

1021 GGATGCAAAT TGGATGCTAT CAAGGTATTC TGTAATATGG AAACTGGGGA AACATGCATA 

1081 AGTGCCAATC CTTTGAATGT TCCACGGAAA CACTGGTGGA CAGATTCTAG TGCTGAGAAG 

1141 AAACACGTTT GGTTTGGAGA GTCCATCGAT GGTGGTTTTC AGTTTAGCTA CGGCAATCCT 

1201 GAACTTCCTG AAGATGTCCT TGATGTGCAG CTGGCATTCC CTCGACTTCT CTCCAGCCGA 

1261 GCTTCCCAGA ACATCACATA TCACTGCAAA AATAGCATTG CATACATGGA TCAGGCCAGT 

1321 GGAAATGTAA AGAAGGCCCT GAAGCTGATG GGGTCAAATG AAGGTGAATT CAAGGCTGAA 

1381 GGAAATAGCA AATTCACCTA CACAGTTCTG GAGGATGGTT GCACGAAACA CACTGGGGAA 

1441 TGGAGCAAAA CAGTCTTTGA ATATCGAACA CGCAAGGCTG TGAGACTACC TATTGTAGAT 

1501 ATTGCACCCT ATGACATTGG TGGTCCTGAT CAAGAATTTG GTGTGGACGT TGGCCCTGTT 

1561 TGCTTTTTAT AA 
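
As an illustrative aside, a listing such as that of Figure 2 can be checked computationally for the repeating Gly X Y pattern referred to in the claims. The following Python sketch assumes that the listing is read in the frame beginning at nucleotide 1 and translated with the standard genetic code. The function names are hypothetical, and only the first row of the listing is used as input.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Drop position numbers and spacing; pass sequence rows only, not captions."""
    return re.sub(r"[^ACGT]", "", text.upper())


def translate(dna):
    """Translate from the first nucleotide; trailing partial codons are ignored."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def longest_gly_x_y_run(protein):
    """Return the largest number of consecutive complete Gly-X-Y triplets."""
    best = 0
    for start in range(len(protein)):
        count = 0
        i = start
        while i + 3 <= len(protein) and protein[i] == "G":
            count += 1
            i += 3
        best = max(best, count)
    return best


# First row of the Figure 2 listing, copied with its position number.
FIRST_ROW = "1 CCAGGCCCAC TTGGGATTGC TGGGATCACT GGAGCACGGG GTCTTGCAGG ACCACCAGGC"
protein = translate(clean_listing(FIRST_ROW))
print(protein)                       # PGPLGIAGITGARGLAGPPG
print(longest_gly_x_y_run(protein))  # 6
```

In this row glycine occupies every third residue, which is the pattern expected of a triple helical forming region. Applying the same functions to the complete listing is left to the reader.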
FIGURE 3 